Optimizing Relational Queries by Materializing Natural Joins

Authors

  • Xun Cheng
  • Jianwen Su
Abstract

Efficient evaluation of user queries is critical in database applications involving very large amounts of data. In these applications, especially those where query answers are expected in real time, the performance of query evaluation in a DBMS may be poor; in such cases, query performance is often extremely sensitive to the structure of the database schema. For example, if a query joins several very large relations (in terms of the number of tuples), the joins may be very expensive even with efficient algorithms. On the other hand, if such joins are “materialized”, i.e., precomputed, stored, and properly indexed, they can be avoided at query evaluation time. In our experiments, the performance improvement from join elimination is significant. Based on an analysis of several large operational database application systems and on experimental results, we argue that normalized database schemas should be retained for the sake of semantic integrity (upon updates) and that some joins can be materialized to improve query performance. In this paper, we develop techniques to (1) select joins to be materialized based on query statistics and ER diagrams, (2) generate materialized joins with tags to avoid duplicate removal, and (3) translate queries on the base relations into equivalent ones that use the materialized joins. We also describe an architecture for embedding this performance tuning method into existing application systems, and a prototype implemented within the Alexandria Digital Library system. We report experimental results on this prototype.
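The basic idea of materializing a join can be illustrated with a minimal sketch. It is not the authors' prototype: the relations `author` and `paper`, the stored table `author_paper`, and the index name are all hypothetical, and the sketch covers only precomputing, storing, and indexing one natural join and then answering a query from it. Join selection from query statistics, tagging, automatic query translation, and refreshing the stored join after updates are not shown.

```python
import sqlite3

# In-memory database with two hypothetical normalized base relations.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (author_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE paper (
    paper_id  INTEGER PRIMARY KEY,
    author_id INTEGER REFERENCES author(author_id),
    title     TEXT
);
INSERT INTO author VALUES (1, 'Ada'), (2, 'Bob');
INSERT INTO paper VALUES (10, 1, 'Joins'), (11, 1, 'Indexes'), (12, 2, 'Views');

-- Materialize the natural join: precompute it, store it, and index it.
CREATE TABLE author_paper AS SELECT * FROM author NATURAL JOIN paper;
CREATE INDEX idx_author_paper_name ON author_paper(name);
""")

# Query written against the base relations (the join runs at query time).
base_rows = conn.execute(
    "SELECT title FROM author NATURAL JOIN paper WHERE name = ? ORDER BY title",
    ("Ada",),
).fetchall()

# Hand-written equivalent query against the stored join (no join at query time).
materialized_rows = conn.execute(
    "SELECT title FROM author_paper WHERE name = ? ORDER BY title",
    ("Ada",),
).fetchall()

# Both formulations should return the same answer.
assert base_rows == materialized_rows
print(materialized_rows)
```

In this sketch the base relations stay in place, which matches the abstract's argument for keeping the normalized schema; the stored join is a redundant copy that would have to be refreshed whenever `author` or `paper` changes.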

Similar resources

Optimizing Multiple Top-K Queries over Joins

Advanced data mining applications require more and more support from relational database engines. In particular, clustering applications in high-dimensional feature spaces demand proper support for multiple top-k queries in order to perform projected clustering. Although some research tackles the problem of optimizing restricted ranking (top-k) queries, there is no solution considering more than on...

Relational Databases Query Optimization using Hybrid Evolutionary Algorithm

Optimizing database queries is one of the hard research problems. Exhaustive search techniques such as dynamic programming are suitable for queries with a few relations, but as the number of relations in a query increases, they require too much memory and processing to remain practical, so random and evolutionary methods have to be used instead. The use of evolutionary methods, beca...

Processing Inequality Queries

Bernstein and Goodman showed that natural inequality (NI) queries can be processed efficiently by semijoins, if there are no multiple inequality join edges, nor cycles with one or zero doublet. In this paper, procedures to handle these cases efficiently are given. Multiple inequality join edges can be processed by multi-attribute inequality semijoins. Two procedures based on generalized semi-j...

[4] Chiang Lee, Chi-Sheng Shih, and Yaw-Huei Chen. Optimizing large join queries using a graph-based approach. IEEE Trans. Knowl. Data Eng., 13(2):298–315, 2001.

References [1] Leonidas Fegaras. A new heuristic for optimizing large queries. [2] Toshihide Ibaraki and Tiko Kameda. On the optimal nesting order for computing n-relational joins. Optimizing large join queries using a graph-based approach. [5] Guido Moerkotte and Thomas Neumann. Analysis of two existing and one new dynamic programming algorithm for the generation of optimal bushy join trees wi...

Optimizing database-backed applications with query synthesis

Object-relational mapping libraries are a popular way for applications to interact with databases because they provide transparent access to the database using the same language as the application. Unfortunately, using such frameworks often leads to poor performance, as modularity concerns encourage developers to implement relational operations in application code. Such application code does no...

Publication date: 1997